Error analysis and software complexity have received increased attention in software engineering 
research over the past several years. The study of software errors has been necessitated by the 
emphasis on software reliability. Models such as the one presented by John Musa in this volume 
statistically model such phenomena as the mean-time-between-failures or the probability of a 
failure within a given unit of time. As John indicates, one of the parameters required as input to 
this model is the number of errors existing in the software. 

There are several ways to estimate the number of errors in a piece of software. One is the actuarial 
approach, which assumes a fixed number of errors in a given number of lines of code. A figure 
frequently quoted is one error per one hundred lines. This approach assumes that all software 
is created equal and ignores the advances made in recent years in analyzing 
software characteristics. An alternative approach builds on these advances by relating software 
characteristics to such factors as the error-proneness of a section of code or the difficulty that will 
be experienced in maintaining the code. The purpose of this paper is to review recent research on 
software complexity metrics to determine whether knowing something about software characteristics 
improves our ability to predict the number of errors the software contains or the amount of effort 
required to maintain it. 

If we can validate the use of software metrics for predicting the number of errors in software and 
the difficulty experienced in correcting them, then such metrics will prove a valuable addition to 
both quality assurance and management information systems. During the design phase, metric 
values can be estimated from relevant design information to predict problems which will be 
experienced during coding. Values computed on the actual code can be used in predicting testing 
results, number of delivered bugs, and ease of maintenance. Although a large number of metrics 
have been presented in the literature, two seem to have received the most attention in empirical 
research. I will focus on these two metrics in the remainder of this paper. 

Thomas McCabe (1976) developed a complexity measure based on the cyclomatic number from 
graph theory. McCabe counts the number of regions in a graph of the control flow of a computer 
program. His metric represents the number of basic control path segments which when combined 
will generate every possible path through the program. Thus, McCabe has measured the complexity 
of the control structure. Schneidewind and Hoffmann (1979) demonstrated that the cyclomatic 
number and the reachability measure which can be computed from it were superior to the number 
of source statements in predicting the number of errors in a section of code and the time required 
to find and fix them. Feuer and Fowlkes (1979) also demonstrated that the node count was 
related to the time to repair errors. However, their data indicated that different prediction equations 
should be used with different types of errors. Separate prediction equations might be possible 
when we have (1) developed more robust error classification schemes, and (2) progressed past 
predicting gross errors to predicting types of errors. 

Another approach to software complexity was presented by Maurice Halstead (1977) in his theory 
of Software Science. Halstead maintained that the amount of effort required to generate a 
program can be derived from simple counts of distinct operators and operands and the total 
frequencies of operators and operands. These quantities can be used to calculate the number of 
mental comparisons required to generate a program. Halstead’s effort metric, E, expresses the 
complexity of computer software in psychological terms. Halstead also developed a metric to 
estimate the number of delivered errors in a system. This metric is based on the notion that 
there is a limited amount of code that a programmer can mentally grasp at a single time. When 
a section of code exceeds this value, it is likely that the programmer made at least one mistake in 
producing it. Halstead predicts the number of errors by dividing the total volume of code by this 
critical level for error-prone code. 
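
As a minimal sketch of how these counts combine, assume the four basic counts have already been extracted from a program. The function name, the example counts, and the value passed for the critical level are all hypothetical. The volume is taken as the usual Software Science definition, (N1 + N2) log2(n1 + n2), which the text above does not spell out.

```python
import math


def halstead_metrics(n1, n2, N1, N2, e_crit):
    """Combine Halstead's four basic counts into volume, effort, and an error estimate.

    n1, n2: numbers of distinct operators and operands
    N1, N2: total occurrences of operators and operands
    e_crit: critical level for error-prone code (a caller-supplied assumption)
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    # Effort: E = n1 * N2 * (N1 + N2) * log2(n1 + n2) / (2 * n2)
    effort = (n1 * N2 * volume) / (2.0 * n2)
    # Error estimate: total volume divided by the critical level.
    estimated_errors = volume / e_crit
    return {"volume": volume, "effort": effort, "estimated_errors": estimated_errors}


# Hypothetical counts for a small routine; e_crit=3000.0 is illustrative only.
example = halstead_metrics(n1=10, n2=6, N1=30, N2=34, e_crit=3000.0)
print(example)
```

Keeping the critical level as an explicit argument, rather than fixing a constant, leaves room to calibrate it against a project's own error data.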

Bell and Sullivan (1974) presented a scatterplot which suggested that there was some validity to 
Halstead’s notion of a critical value for error-free code. In their data no program with a Halstead 
volume above 260 was error-free, while only one program below this level had an error. 
Subsequently, both Cornell and Halstead (1976) and Fitzsimmons and Love (1978) found correlations 
of 0.75 and above between Halstead’s metrics and the number of errors found in various software 
products. In a debugging study we recently completed at G.E. (Curtis, Milliman, and Sheppard, 
1979) the Halstead and McCabe metrics were better predictors of the time required to find a bug 
than was lines of code. 

In studying some error data provided us by Rome Air Development Center, Phil Milliman and I 
(1979) found Halstead’s metric a remarkably accurate predictor of delivered bugs in a system 
developed with modern programming practices and tools. However, the prediction was poor in a 
system developed with conventional techniques. The types of errors experienced in the former 
system were typical when compared to the types of errors reported in other systems (in particular 
to several reported by TRW). Phil and I also observed that the error ratio reported during the 
final months of development was an excellent predictor of post-development test errors. The 
error ratio represents the number of failed runs divided by the total number of runs. We observed 
a linearly decreasing trend in the error ratio during the final 9 months of development. When we 
extrapolated this trend into post-development testing, we observed a good prediction of the 
number of errors detected. 
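
The extrapolation can be sketched as an ordinary least-squares line fitted to the monthly error ratios and projected forward. This is only an illustration under stated assumptions: the monthly ratios, the number of runs per month, and the function names are hypothetical, and the sketch treats the predicted count of failed runs as a stand-in for errors detected.

```python
def error_ratio(failed_runs, total_runs):
    """Error ratio: failed runs divided by total runs."""
    return failed_runs / total_runs


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_failed_runs(months, ratios, future_months, runs_per_month):
    """Extrapolate the fitted trend and sum the expected failed runs."""
    slope, intercept = fit_line(months, ratios)
    predicted = 0.0
    for month in future_months:
        # A ratio cannot be negative, so clamp the extrapolated line at zero.
        ratio = max(0.0, slope * month + intercept)
        predicted += ratio * runs_per_month
    return predicted


# Hypothetical data: nine months with a linearly decreasing error ratio.
months = list(range(1, 10))
ratios = [0.50 - 0.04 * m for m in months]
forecast = predict_failed_runs(months, ratios, future_months=[10, 11, 12],
                               runs_per_month=100)
print(round(forecast, 2))
```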

We suspect from the data we have observed that the prediction of errors and maintenance 
resources will be more accurate on projects guided by modern programming practices. We believe 
that such practices will reduce the amount of variation in performance and quality resulting from 
such sources as individual differences among programmers, the programming environment, etc. 
That is, a structured discipline constrains the amount of variation in the way software is developed. 
Since this variation is a source of error in predictions, the ability to predict various 
software-related criteria (such as number of errors) should improve. 

Based on the brief review of empirical research presented here, I propose the following conclusions, 
but agree that much more data is needed to substantiate them. 

• Measures of software characteristics can be used to predict the number of errors in a 
portion of code and the effort required to find and correct them. Such measures 
will be more valuable than an actuarial approach based on lines of code. 

• Different predictive plots may be observed for different classes of errors (computational, 
logic, interface, etc.). 

• Metrics should be calculated at the appropriate level (subroutine, module, etc.) for 
explaining the results. 

• The prediction of software reliability and of maintenance requirements can begin early 
in the software development cycle, and improvements can be made and monitored if 
feedback is provided for improving software quality. 

EQUATION: 

$$V(G) = \#\,\text{edges} - \#\,\text{nodes} + 2\,(\#\,\text{connected components})$$

or 

$$V(G) = \#\,\text{predicate nodes} + 1$$

or 

$$V(G) = \#\,\text{regions in a planar graph of the control flow}$$

DESCRIPTION: 

McCabe's metric represents the number of linearly independent control paths comprising a 
program, that is, the number of basic control path segments which when combined will generate 
every possible path through the program. McCabe's V(G) represents a measure of computational 
complexity. 
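
The first two forms of V(G) can be checked against each other on a small control-flow graph. The sketch below is illustrative only: the function names and the example graph are hypothetical, the graph is given as a list of (source, destination) edges, and the predicate-node form assumes every decision is a two-way branch.

```python
from collections import defaultdict


def cyclomatic_number(edges):
    """V(G) = edges - nodes + 2 * (connected components)."""
    nodes = set()
    adjacency = defaultdict(set)
    for src, dst in edges:
        nodes.update((src, dst))
        # Components are counted on the undirected version of the graph.
        adjacency[src].add(dst)
        adjacency[dst].add(src)
    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(adjacency[node] - seen)
    return len(edges) - len(nodes) + 2 * components


def cyclomatic_from_predicates(edges):
    """V(G) = predicate nodes + 1, assuming two-way branches only."""
    out_degree = defaultdict(int)
    for src, _ in edges:
        out_degree[src] += 1
    return sum(1 for degree in out_degree.values() if degree > 1) + 1


# Hypothetical graph: an if/else followed by a while loop.
EXAMPLE_EDGES = [
    ("entry", "if"),
    ("if", "then"), ("if", "else"),
    ("then", "while"), ("else", "while"),
    ("while", "body"), ("body", "while"),
    ("while", "exit"),
]
print(cyclomatic_number(EXAMPLE_EDGES), cyclomatic_from_predicates(EXAMPLE_EDGES))
```

For this graph both forms give 3: eight edges, seven nodes, and one connected component in the first form, and two predicate nodes (the if and the while) in the second.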








B. Curtis 
G.E. 



General Electric Company, Space Division 
Information Systems Programs, Software Management Research 

SCHNEIDEWIND AND HOFFMANN'S DATA (1979) 

[Table: correlations. Headings: Predictor, Number of procedures, # of errors, Find time, Fix time. 
Surviving row: Cyclomatic number, .59, .59 (remaining values lost).] 

$$E = \frac{\eta_1 N_2 (N_1 + N_2) \log_2(\eta_1 + \eta_2)}{2 \eta_2}$$

WHERE: 

$\eta_1$ = # of unique operators 
$\eta_2$ = # of unique operands 
$N_1$ = # of operators 
$N_2$ = # of operands 

DESCRIPTION: 

The amount of effort required to generate a program can be derived from simple counts 
of distinct operators and operands and the total frequencies of operators and operands. 
These quantities can be used to calculate the number of mental comparisons required 
to generate a program. Halstead's effort metric, E, expresses the complexity of 
computer software in psychological terms. 

$$\text{delivered errors} = \frac{V}{E_{crit}}$$

[A numeric form in $V$, $\lambda$, and the constant 13,824 also appears here but is not fully legible.] 

WHERE: 

$V$ = volume 
$E_{crit}$ = the mean number of elementary discriminations between potential errors in programming 
$\lambda$ = level of the implementation language 

[Table: column headings not recoverable; the last row holds the column totals, and the third total is missing.] 

170.3    102    102 
 15.3     18     20 
322.6    196    156 
 28.2     26     30 
100.2     71     71 
 65.5     37     59 
  6.5     16     11 
 58.5     50     50 
135.9     80     88 
903.0    596 

[Table: leading column headings not recoverable; the last heading ends in "E WITH ERRORS".] 

COMMAND EXECUTIVE     47    70-7100     53,920    .81 
DATABASE MANAGER      92    10-6050     69,910    .75 
REPORT GENERATOR      51    50-3700     97,950    .75 
TOTAL                190    10-7100    166,280    .77 

                      E         V(G)      LENGTH 

INTERRELATIONSHIPS 
  V(G), LENGTH        .56***, .90*** [cell positions not recoverable] 

TIME TO FIND BUG: 
  TOTAL PROGRAM       .75***    .65***    .52*** 
  SUBROUTINE          .66***    .63***    .67*** 

NOTE: N = 27 
*** p < .001 

• Measures of software characteristics can be used to predict the 
number of errors in a portion of code and the effort required 
to find and correct them 

• Different predictive plots will be observed for different classes 
of errors 

• There are optimal levels in the code for calculating metrics 

• The prediction of software reliability and maintenance requirements 
can begin early in the software development cycle, and improvements 
can be monitored 